fix: sanitize store addrs in logs#8078
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the connection string sanitization logic to more robustly redact sensitive information like passwords from logs and debug outputs. It introduces the url crate for parsing URL-formatted strings and utilizes a regex for key-value pairs. The feedback suggests expanding the set of redacted keys to include 'token' and 'secret' for both URL query parameters and key-value connection strings.
| const REDACTED: &str = "***"; | ||
|
|
||
| static SENSITIVE_KV_RE: LazyLock<Regex> = LazyLock::new(|| { | ||
| Regex::new(r#"(?i)(^|\s)(password|pass|pwd)\s*=\s*('[^']*'|"[^"]*"|\S+)"#).unwrap() |
There was a problem hiding this comment.
The regex for sensitive key-value pairs could be more comprehensive. Common sensitive keys in connection strings also include token and secret. Adding these would improve the robustness of the sanitization.
| Regex::new(r#"(?i)(^|\s)(password|pass|pwd)\s*=\s*('[^']*'|"[^"]*"|\S+)"#).unwrap() | |
| Regex::new(r#"(?i)(^|\s)(password|pass|pwd|token|secret)\s*=\s*('[^']*'|"[^" ]*"|\S+)"#).unwrap() |
| fn is_sensitive_key(key: &str) -> bool { | ||
| matches!( | ||
| key.to_ascii_lowercase().as_str(), | ||
| "password" | "pass" | "pwd" |
Signed-off-by: Detachm <42765252+Detachm@users.noreply.github.com>
8f92df3 to
e56fc56
Compare
|
Updated the sanitizer to also redact |
There was a problem hiding this comment.
Pull request overview
This PR addresses sensitive credential leakage by sanitizing store_addrs before they are emitted in logs/debug output, improving redaction for both URL-style DSNs and key-value connection strings across metasrv/cmd/cli usage.
Changes:
- Strengthened
sanitize_connection_stringto redact URL userinfo and sensitive query/key-value parameters (e.g.,password,token,secret). - Updated CLI kvbackend construction log to print sanitized
store_addrs. - Added targeted tests ensuring
Debug/log output does not contain raw secrets forStartCommandandMetasrvOptions, plus expanded sanitizer unit tests.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| src/meta-srv/src/metasrv.rs | Adds a test ensuring MetasrvOptions debug output sanitizes store_addrs. |
| src/common/meta/src/kv_backend/util.rs | Reworks connection-string sanitization using url parsing + regex redaction; expands unit tests. |
| src/common/meta/Cargo.toml | Adds url dependency to support URL parsing in the sanitizer. |
| src/cmd/src/metasrv.rs | Adds a test ensuring StartCommand debug output sanitizes store_addrs. |
| src/cli/src/common/store.rs | Sanitizes store_addrs in kvbackend construction log; adds unit test for store addr sanitization. |
| Cargo.lock | Locks the added url dependency. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
MichaelScofield
left a comment
There was a problem hiding this comment.
Please fix the CI.
| let pairs = url | ||
| .query_pairs() | ||
| .map(|(key, value)| { | ||
| let value = if is_sensitive_key(&key) { | ||
| REDACTED.into() | ||
| } else { | ||
| value | ||
| }; | ||
| (key.into_owned(), value.into_owned()) | ||
| }) | ||
| .collect::<Vec<_>>(); | ||
|
|
||
| url.query_pairs_mut().clear().extend_pairs(pairs); |
There was a problem hiding this comment.
Is it possible to redact in place? Like this:
| let pairs = url | |
| .query_pairs() | |
| .map(|(key, value)| { | |
| let value = if is_sensitive_key(&key) { | |
| REDACTED.into() | |
| } else { | |
| value | |
| }; | |
| (key.into_owned(), value.into_owned()) | |
| }) | |
| .collect::<Vec<_>>(); | |
| url.query_pairs_mut().clear().extend_pairs(pairs); | |
| url | |
| .query_pairs_mut() | |
| .for_each_mut(|(key, value)| { | |
| if is_sensitive_key(&key) { | |
| *value = REDACTED.into() | |
| } | |
| }) |
There was a problem hiding this comment.
Thanks for the suggestion. I tried the in-place redaction approach, but url::Url::query_pairs_mut() returns a serializer and does not expose a mutable iterator over existing query pairs, so for_each_mut is not available in the current url API.
The current implementation keeps using url to parse the query pairs, redacts sensitive values, then clears and writes the pairs back via extend_pairs. This avoids hand-rolling query-string parsing while preserving the same redaction behavior.
I also checked the visible failing checks. They seem to come from an older cancelled workflow run, while the latest CI run for the current head commit has passed.
|
@Detachm The CI needs to be fixed. |
Closes #7525
Sanitizes
store_addrsbefore logging kvbackend construction and strengthens connection string sanitization for URL and key-value formats.The fix keeps useful host/database information in logs while redacting credentials such as URL userinfo and password-like key-value fields.
Validation:
cargo test -p common-meta test_sanitize_connection_string --quietcargo test -p cli test_sanitize_store_addrs --quietcargo test -p cmd test_start_command_debug_sanitizes_store_addrs --quietcargo test -p meta-srv test_metasrv_options_debug_sanitizes_store_addrs --quietcargo fmt --all -- --checkgit diff --check